WCER Working Paper No. 2009-6 

August 2009 



Tracking and Inequality: 

New Directions for Research and Practice 

Adam Gamoran 

Department of Sociology/ 

Wisconsin Center for Education Research 
University of Wisconsin-Madison 
gamoran@ssc.wisc.edu



Wisconsin Center for Education Research 

School of Education • University of Wisconsin-Madison • http://www.wcer.wisc.edu/ 




Copyright © 2009 by Adam Gamoran 
All rights reserved. 

Readers may make verbatim copies of this document for noncommercial purposes by any means, 
provided that the above copyright notice appears on all copies. 

WCER working papers are available on the Internet at
http://www.wcer.wisc.edu/publications/workingPapers/index.php. Recommended citation:

Gamoran, A. (2009). Tracking and inequality: New directions for research and practice 
(WCER Working Paper No. 2009-6). Madison: University of Wisconsin-Madison, 
Wisconsin Center for Education Research. Retrieved [e.g., August 20, 2009,] from 
http://www.wcer.wisc.edu/publications/workingPapers/papers.php




For more than a century, educators and researchers have debated the merits of separating 
students into different tracks, classes, and groups according to their purported interests and 
abilities (for historical perspectives, see Powell, Farrar, & Cohen, 1985; Oakes, Gamoran, & 
Page, 1992; Loveless, 1998, 1999; Oakes, 2005). The practice, known as tracking and ability 
grouping in the U.S. and streaming and setting in the U.K., is intended to create conditions in 
which teachers can efficiently target instruction to students’ needs. 1 Despite this intended benefit, 
tracking has been widely criticized as inegalitarian because students in high tracks tend to widen 
their achievement advantages over their low-track peers, and because measures of school 
performance that are commonly used to assign students to tracks typically coincide with the 
broader bases of social disadvantage such as race/ethnicity and social class, leading to 
economically and/or ethnically segregated classrooms. Yet tracking has been highly resistant to 
lasting change and remains in wide use in various forms in the U.S., the U.K., and school 
systems around the world. 

Although struggles over tracking involve instructional and political challenges that play 
out in schools and classrooms, the persisting debate reflects not only local concerns but also 
broader tensions inherent in education systems (Oakes et al., 1992). On the one hand, schools are 
charged with providing all students with a common framework of cognitive and social skills 
essential for full participation in the civic and economic activities of adult society. On the other 
hand, schools are structured to sort and select students for different trajectories aligned with their 
varied orientations and capacities. This ongoing tension between commonality and 
differentiation is at the heart of the tracking debate: Is the purpose of schooling to provide all 
students with a common socialization? Or is it to differentiate students for varied futures? The 
former aim is consistent with mixed-ability teaching, whereas the latter is consistent with 
tracking, and the debate has no simple resolution because school systems embody both goals. 

Building on past studies, recent work on tracking has advanced in three areas that 
indicate promising new directions for research and practice. First, new international scholarship 
has extended knowledge about the consequences of tracking for student achievement to contexts 
beyond the U.S. and U.K., where most prior research had been conducted. Second, recent studies 
of attempts to reduce or eliminate tracking and ability grouping have yielded important insights
about why tracking is resistant to change and how some of the obstacles to detracking may be
surmounted. Third, a new wave of research on classroom assignment and instruction has pointed
toward approaches that, while not resolving the tension between commonality and
differentiation, may capture the benefits of differentiation for meeting students’ varied needs
without giving rise to the consequences for inequality that commonly accompany tracking and
ability grouping. These findings, in turn, call for new research and experimentation in practice.

This paper was prepared for The Routledge International Handbook of the Sociology of Education, edited by
Michael W. Apple, Stephen J. Ball, and Luis Armando Gandin. New York: Routledge, in press. The author is grateful
for helpful research assistance from Michelle Robinson.

1 U.S. writers often use the terms tracking and ability grouping interchangeably. For brevity, I use the single term
tracking to capture all the various forms of structural differentiation for instruction. When distinguishing among
different forms, I use the term tracking to refer to the practice of dividing students into separate classes (or clusters
of classes) for all of their academic subjects, and the term ability grouping to mean the division of students into
classes on a subject-by-subject basis. This use parallels the meanings of the terms streaming and setting in the U.K. I
use the term within-class ability grouping to refer to the use of instructional groups within class for a particular
subject and between-school grouping to refer to systems in which students are assigned to separate schools targeted
to different futures based on academic performance.

Before turning to these latest findings, I summarize the earlier literature on the effects of 
grouping and tracking on student achievement. This research has been well covered in prior 
reviews (e.g., Kulik & Kulik, 1982; Slavin, 1987, 1990; Gamoran & Berends, 1987; Oakes et al., 
1992; Harlen & Malcolm, 1997; Hallam, 2002; Gamoran, 2004), but I begin with it here because 
it sets the stage for the promising work of the present and the new directions for the future. Thus, 
the remainder of this paper is divided into four sections: (a) a review of findings about tracking 
and achievement that links work from the 1970s, 1980s, and 1990s to updated studies in the 
same vein; (b) a discussion of recent international research on tracking, both between and within 
schools; (c) an analysis of new studies of efforts to reduce or eliminate tracking; and (d) a 
conclusion calling for new research and practice based on the latest findings. 

Tracking and Achievement: Increased Inequality Without Benefits to Productivity 

Following Gamoran and Mare (1989), one may distinguish two possible consequences of 
tracking for achievement: it may affect productivity (the overall level of achievement in the 
school or class), and it may affect inequality (the distribution of achievement across the different 
tracks, classes, or groups). Although not all studies have reached the same conclusions about 
these outcomes, the weight of the evidence indicates that tracking tends to exacerbate inequality 
with little or no contribution to overall productivity. This occurs because gains for high achievers 
are offset by losses for low achievers. A compelling example of this pattern comes from 
Kerckhoff’s (1986) study of ability grouping between and within schools in England and Wales.
Kerckhoff used data from the National Child Development Study, which followed for more than 
30 years all children born in the U.K. in the first week of March in 1958. He examined secondary 
school achievement in reading and mathematics among students enrolled in schools for high 
achievers (grammar schools), low achievers (secondary modern schools), and those of widely
varying achievement levels (comprehensive schools). He also compared students assigned to 
high, middle, low, and mixed-ability classes within the different types of schools. Comparisons
between and within schools told a consistent story: There were no overall benefits to average 
achievement in contexts that differentiated students for instruction as compared with mixed- 
ability contexts. However, sorting students into selective schools and classes was associated with 
increasing gaps between high and low achievers over time (see also Kerckhoff, 1993). 
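
The distinction between productivity and inequality can be made concrete with a small numerical sketch. The function names and all achievement gains below are hypothetical, chosen only to show how gains for high achievers can be offset by losses for low achievers; they are not drawn from Kerckhoff’s data.

```python
from statistics import mean


def productivity(gains_by_group):
    """Overall mean achievement gain, pooling students from every group."""
    pooled = [g for gains in gains_by_group.values() for g in gains]
    return mean(pooled)


def inequality(gains_by_group, high="high", low="low"):
    """Difference between the mean gain of high and low achievers."""
    return mean(gains_by_group[high]) - mean(gains_by_group[low])


# Invented gains for initially high and low achievers (illustration only).
mixed_ability = {"high": [10, 10, 10], "low": [10, 10, 10]}
tracked = {"high": [13, 13, 13], "low": [7, 7, 7]}

for label, setting in [("mixed-ability", mixed_ability), ("tracked", tracked)]:
    print(label, productivity(setting), inequality(setting))
```

In this invented case both settings yield the same average gain, so productivity is unchanged, while the gap between high and low achievers grows from 0 to 6 points under tracking.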

The comparison of tracking to mixed-ability teaching has received less attention in the 
U.S. because tracking has been nearly universal at the secondary level (Loveless, 1998). 
However, comparisons of ability-grouped and mixed-ability classes in middle school
mathematics and science (Hoffer, 1992) and English (Gamoran & Nystrand, 1994) in the U.S. 
have yielded the same pattern reported by Kerckhoff (1986, 1993). National survey analyses in 
the U.S. have also demonstrated that over the course of high school, students assigned to high 
and low tracks grow farther and farther apart in achievement (e.g., Heyns, 1974; Alexander,
Cook, & McDill, 1978; Gamoran, 1987a, 1992; Gamoran & Mare, 1989; Lucas & Gamoran, 2002).

Because track location is correlated with traditional bases of socioeconomic 
disadvantage, tracking not only widens achievement gaps but also reinforces social inequality 
(Oakes et al., 1992; Lucas & Berends, 2002). In contrast to socioeconomic status, which has 
direct effects on track assignment, race and ethnicity affect track assignment indirectly. Minority 
students whose test scores and socioeconomic backgrounds match those of Whites are no less 
likely to be placed in high tracks (Gamoran & Mare, 1989; Lucas & Gamoran, 2002; Tach & 
Farkas, 2006). However, because minority students tend to reach high school with lower test 
scores and less advantaged socioeconomic circumstances, tracking works to the disadvantage of 
minority students and contributes to achievement gaps. 

As the demographic makeup of U.S. schools has changed, new patterns of inequality 
associated with tracking have become more salient. With regard to language minorities, Callahan 
(2005) argued that schools often conflate limited proficiency in English with limited ability to 
master academic content. As a result, English language learners are tracked into classes with 
modified curricula that are less rigorous than those of regular classes, which prevents these 
students from gaining access to advanced instruction even as their language skills develop. While 
Callahan supported these assertions with a study of a rural California school, Paul (2005) 
reached a similar conclusion based on her study of five diverse urban schools. Paul noted that 
enrollment in Algebra 1, the gateway to the college-preparatory curriculum, was stratified by 
race and ethnicity, with Asian American and White students enrolled in higher proportions and 
African American and Hispanic students enrolled in lower proportions. When English language 
learners enrolled in the same levels of algebra as fluent English speakers, they had similar rates 
of college-preparatory course work. Foreshadowing this work, Padilla and Gonzales (2001) 
argued that one reason recent Mexican immigrants outperform second-generation students is that 
the immigrants have spent less time in low tracks in U.S. schools. 

New forms of tracking in the U.S. have exhibited patterns of inequality comparable to 
those of earlier forms. Using high school transcripts from a national sample of students, Lucas 
(1999) showed that students were grouped on a subject-by-subject basis rather than by broad 
curricular programs. Nevertheless, students’ course levels tended to correlate across subject 
areas, and this more subtle version of tracking still resulted in achievement inequality. Mitchell 
and Mitchell (2005) demonstrated that multitrack, year-round schools also tended to stratify 
students by social origins. Both Lewis and Cheng (2006) and Mickelson and Everett (2008) 
found that the transformation of vocational education into career and technical education, though 
accompanied by greater emphasis on academic work within technical courses of study, still 
resulted in stratified class enrollments. 

Generally, elementary and middle schools have seen a pattern of increasing inequality 
similar to that observed at the high school level (e.g., Rowan & Miracle, 1983; Hoffer, 1992; 
Gamoran, Nystrand, Berends, & LePore, 1995). Until recently, national data were available only 
at the secondary level, so it was not possible to examine the generalizability of patterns of 
inequality associated with elementary school ability grouping. However, recent analyses of data 
from a national sample of children who entered kindergarten in 1998 have confirmed the pattern 
of widening gaps for within-class reading groups in kindergarten (Tach & Farkas, 2006). Using
later waves of the same data, Lleras and Rangel (2009) reported similar findings for between- 
class ability grouping in Grades 1 and 3. 

In an exception to the general pattern, Slavin (1987) reported (based on a synthesis of 
research on elementary school grouping) positive effects of within-class grouping in 
mathematics for students in low-ranked as well as high-ranked groups. Slavin also noted that 
when students were regrouped for specific subjects, rather than being tracked for the entire 
school day, ability grouping had positive effects for students at all achievement levels. On the 
basis of these findings, Slavin proposed that elementary school ability grouping can have 
positive effects when (a) assignment is based on criteria relevant to the subject, (b) students can 
be moved from one group to another as appropriate to their progress, and (c) curriculum and 
instruction are differentiated to meet the needs of students assigned to the different groups. 

Slavin’s conclusions have recently been affirmed by Connor and her colleagues (Connor, 
Morrison, Fishman, Schatschneider, & Underwood, 2007; Connor et al., 2009). Connor’s work 
shows that small reading groups can be used effectively to tailor reading instruction to students’ 
needs. In a randomized comparison, Connor et al. (2007) reported that students taught by 
teachers who arranged students into reading groups according to carefully assessed student 
performance levels, and who aimed instruction at students’ specific needs, performed much 
better by the end of first grade than those taught by teachers who did not have access to this 
systematic approach to assigning students and differentiating instruction. Though based on less 
precise evidence, Tomlinson et al. (2003) advanced similar claims about the value of within- 
class differentiation of instruction as a strategy for effective teaching of students with varied 
interests and skills. 

Challenges in Measuring the Effects of Tracking 

Two methodological challenges have confronted researchers studying the impact of 
tracking and ability grouping on student achievement. One challenge has been to measure 
accurately students’ group and track locations. At the secondary level, research from the 1970s 
and 1980s often relied on students to report whether their curricular programs could best be 
described as academic/college-preparatory, vocational, or general. This social-psychological 
measure of tracking was useful as an indicator of students’ perceptions but did not necessarily 
represent students’ actual learning opportunities. Lucas (1999) developed a structural measure of 
track location by using students’ transcripts to identify tracks based on the courses students had 
taken. Lucas and Gamoran (2002) showed that structural and social-psychological dimensions of 
tracking had independent effects on student achievement, but both contributed to achievement 
gaps. Other researchers have used network analysis techniques to identify tracks through the 
configuration of courses in which students enroll (Friedkin & Thomas, 1997; Heck, Price, & 
Thomas, 2004), reaching similar conclusions about tracking and inequality. More recent studies 
have also uncovered inequality using teacher reports to distinguish among ability groups at high, 
middle, and low levels (Carbonaro, 2005; Tach & Farkas, 2006). 

The second methodological challenge has been to distinguish the effects of track 
assignment from the effects of preexisting differences among students assigned to different 
tracks. Obviously, students in high and low tracks are on different achievement trajectories to 
begin with; that is how they come to be located in different tracks. All the analyses discussed
here have controlled for prior achievement and social background, but due to unreliability and 
measurement error, not all preexisting conditions may have been captured by the controls, and 
the potential for selectivity bias remains. Researchers have endeavored to respond to this 
challenge in two ways. First, a few studies, mainly prior to 1970, used random assignment to 
tracked or untracked settings to rule out selectivity bias (Slavin, 1987, 1990). These studies 
yielded widely varying estimates of track effects that centered around zero. Because they 
provided little information on what was going on inside the tracks, it is difficult to assess the 
generalizability of these small and long-ago experiments. In at least some cases of zero effects, 
teachers designed instruction and curriculum to be the same across tracks, in contrast to the real 
world where tracking is typically accompanied by curricular and instructional differentiation. 
These findings led Gamoran (1987b) to argue that the effects of tracking depend on how it is 
implemented, a conclusion later supported by both case study (Gamoran, 1993) and survey 
analyses (Gamoran, 1992). 

Second, researchers have used econometric techniques to mitigate selectivity bias. 
Gamoran and Mare (1989) estimated endogenous switching regressions that model track 
assignment and track effects simultaneously, allowing for correlated errors among unobserved 
predictors of assignment and outcomes. Their results, which focused on mathematics 
achievement and high school completion for the high school class of 1982, indicated that the 
pattern of increasing inequality observed in standard regression analyses with rich controls was 
upheld in the more complex technique. Lucas and Gamoran (2002) replicated these results for 
the high school class of 1992 as well as the class of 1982, and with course-based as well as self- 
reported indicators of track location. Again, the main findings were upheld. However, Betts and 
Shkolnik (2000), who estimated both propensity models and two-stage least squares regression
models of track effects on mathematics achievement, concluded that the differential effects of 
tracking for students in high and low tracks were much smaller than reported in earlier studies 
that relied on simple regressions. Figlio and Page (2002) similarly called into question the 
negative effects of tracking on secondary school math achievement on the basis of two-stage 
least squares regression models. While it is premature to conclude that tracking is not harmful to 
low achievers, these studies, combined with the early experimental research, suggest the effects 
may be smaller than is typically assumed. Since Gamoran and Mare focused on broad curricular 
tracking while Betts and Shkolnik and Figlio and Page examined between-class ability grouping, 
the findings may also indicate that the latter is less consequential for inequality than the former. 
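
The logic of selectivity bias, and of the instrumental-variable strategy that underlies two-stage least squares, can be illustrated with a simulation. This is a sketch under strong assumptions, not the models estimated in the studies cited above: the data are simulated, the instrument is a hypothetical binary variable that shifts track placement but is unrelated to unobserved ability, the true track effect is set to 1.0 by construction, and a simple Wald (instrumental-variable) ratio stands in for a full two-stage least squares or endogenous switching regression.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def simulate(n=50_000, true_effect=1.0, seed=0):
    """Simulate instrument z, high-track indicator t, and achievement y."""
    rng = random.Random(seed)
    z, t, y = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)      # unobserved by the analyst
        z_i = rng.randint(0, 1)        # hypothetical instrument, independent of ability
        # Placement depends on ability, the instrument, and chance.
        t_i = 1 if ability + z_i + rng.gauss(0, 1) > 0.5 else 0
        # Achievement depends on track and on the same unobserved ability.
        y_i = true_effect * t_i + 2.0 * ability + rng.gauss(0, 1)
        z.append(z_i)
        t.append(t_i)
        y.append(y_i)
    return z, t, y


z, t, y = simulate()
naive_estimate = cov(t, y) / cov(t, t)   # simple regression slope of y on t
iv_estimate = cov(z, y) / cov(z, t)      # Wald / instrumental-variable ratio

print(f"naive: {naive_estimate:.2f}  IV: {iv_estimate:.2f}  true: 1.00")
```

In this setup the naive comparison overstates the track effect because high-track students also have higher unobserved ability, while the instrumental-variable estimate falls close to the value built into the simulation. That outcome depends entirely on the instrument being valid and reasonably strong, which cannot be guaranteed with real data.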



2 The models estimated by Betts and Shkolnik (2000) and Figlio and Page (2002) rely on very strong assumptions, 
so their results should be interpreted with particular caution. Betts and Shkolnik’s conclusions rest on comparisons 
of classes at similar ability levels as reported by teachers but located in schools that differed on whether the principal 
reported that tracking was used for mathematics. Yet teacher reports of class ability levels may reflect between-class 
ability grouping regardless of the principal’s report. Figlio and Page (2002) used as instruments for track assignment 
indicators that, on the face of it, seem far-fetched: two- and three-way interactions between the number of courses 
required for graduation, the number of schools in the county, and the fraction of voters in the county who voted for 
President Reagan in 1984. Weak instruments would undermine the estimates of track effects and could bias them 
towards zero. 






Mechanisms of Track Effects on Achievement 

With few exceptions, the evidence indicates that tracking tends to magnify inequality. 
Why is that the case? Conceptually, researchers have identified mechanisms of social 
comparison as well as differentiated instruction, but empirically it appears that instructional 
variation across tracks and groups at different levels is the more prominent reason for increases 
in achievement gaps between tracks. A number of studies have concluded that students in high 
tracks encounter more challenging curricula, move at a faster pace, and are taught by more 
experienced teachers with better reputations, while students in low tracks encounter more 
fragmented, worksheet-oriented, and slower-paced instruction provided by teachers with less 
experience or clout (for reviews, see Oakes et al., 1992; Gamoran, 2004). These findings have 
emerged at the elementary, middle, and high school levels. Instructional differences reflect not 
only what teachers do in classrooms, but also how students respond. A recent finding along these 
lines comes from the work of Carbonaro (2005), who demonstrated that achievement diverges in 
part because high-track students put forth more effort on their school work than low-track 
students. While this finding reflected, in part, low-track students’ responses to instruction that 
was less intellectually stimulating than the instruction given to high-track classes, it also 
stemmed from differences that students brought with them to class. 

Other new examples of instructional mediation of the effects of tracking come from both 
hypothesis testing and interpretive research. In a study of 64 middle and high school English 
classes, Applebee, Langer, Nystrand, and Gamoran (2003) reported greater use of discussion- 
based approaches to literature instruction in high-ability than in low-ability classes, and this 
difference accounted for just over one third of the effect of ability group assignment on writing 
performance. Discussion-based approaches included (a) using authentic questions (questions 
with no pre-specified answer) and “uptake” questions (questions building on prior statements),
(b) encouraging open discussion, (c) drawing in multiple perspectives (“envisionment building”),
and (d) initiating conversations that connected different curricular topics. Watanabe (2008) 
reported parallel instructional differences based on in-depth analyses of 68 hours of classroom 
observation in two teachers’ language arts classes. In high-ability classes, she found more 
engagement with challenging and meaningful curricula, more writing assignments in more 
diverse genres, and more feedback from teachers, as contrasted with more emphasis on test 
preparation in low tracks. 
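
The statement that an instructional difference “accounted for” a share of a group assignment effect can be read as a proportion-mediated calculation. The sketch below shows one common way to compute such a share; the function name and the coefficients are hypothetical, and Applebee et al. (2003) may have computed their figure differently.

```python
def proportion_mediated(total_effect, direct_effect):
    """Share of a total effect that operates through a mediator.

    total_effect:  coefficient on group assignment without the mediator.
    direct_effect: coefficient on group assignment with the mediator controlled.
    """
    if total_effect == 0:
        raise ValueError("total_effect must be nonzero")
    return (total_effect - direct_effect) / total_effect


# Invented coefficients, for illustration only.
share = proportion_mediated(total_effect=0.30, direct_effect=0.19)
print(f"{share:.2f}")
```

With these invented coefficients the mediated share is a little more than one third, the order of magnitude described above.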

Findings that instructional differentiation accounts for much of the effect of tracking have 
led some observers to conclude that tracking per se does not generate inequality, but rather 
inequality has emerged because of the way in which tracking has been implemented (e.g., 
Hallinan, 1994). If instruction in low tracks could be effectively geared toward students’ needs, 
this argument states, then tracking might mitigate rather than exacerbate inequality. While 
reasonable in principle, this goal has proven difficult to accomplish in practice, and there are few 
examples of effective instruction in low-track classes (for exceptions, see Gamoran, 1993; 
Gamoran & Weinstein, 1998). At the same time, it is important to acknowledge that most studies 
of ability grouping and curriculum tracking have found that high-achieving students tend to 
perform better when assigned to high-level groups than when taught in mixed-ability settings. 
Proponents of tracking tend to emphasize the benefits of high-level classes for high-achieving
students with little attention to implications for inequality, while critics tend to focus on the 
inequality without acknowledging the effects for high achievers. As a result, proponents and 







